How to Discount Deep Reinforcement Learning: Towards New Dynamic Strategies
Using deep neural nets as function approximators for reinforcement learning
tasks has recently been shown to be very powerful for solving problems
approaching real-world complexity. Using these results as a benchmark, we
discuss the role that the discount factor may play in the quality of the
learning process of a deep Q-network (DQN). When the discount factor
progressively increases up to its final value, we empirically show that it is
possible to significantly reduce the number of learning steps. When used in
conjunction with a varying learning rate, we empirically show that it
outperforms original DQN on several experiments. We relate this phenomenon to
the instabilities of neural networks when they are used in an approximate
Dynamic Programming setting. We also describe the possibility of falling into a
local optimum during the learning process, thus connecting our discussion with
the exploration/exploitation dilemma.
Comment: NIPS 2015 Deep Reinforcement Learning Workshop
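The increasing discount factor described in the abstract above can be illustrated with a minimal sketch. The abstract does not state the schedule itself, so the geometric update rule, the function name `discount_schedule`, and the parameters `gamma_start`, `gamma_final`, and `shrink` are all assumptions made for illustration, not the authors' method.

```python
def discount_schedule(gamma_start, gamma_final, shrink, num_epochs):
    """Return one discount factor per epoch, rising from gamma_start to gamma_final.

    Assumed rule (hypothetical, not taken from the abstract): after each
    epoch the gap (1 - gamma) is multiplied by `shrink`, with 0 < shrink < 1,
    and the result is capped at gamma_final.
    """
    if not 0.0 < shrink < 1.0:
        raise ValueError("shrink must lie strictly between 0 and 1")
    gammas = []
    gamma = gamma_start
    for _ in range(num_epochs):
        gammas.append(gamma)
        # Shrink the gap to 1 geometrically, never exceeding the final value.
        gamma = min(gamma_final, 1.0 - shrink * (1.0 - gamma))
    return gammas


if __name__ == "__main__":
    schedule = discount_schedule(0.9, 0.99, 0.98, 200)
    print(schedule[0], schedule[-1])
```

In a DQN training loop, the value for the current epoch would replace the fixed discount factor when the Q-learning targets are computed.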
Min Max Generalization for Two-stage Deterministic Batch Mode Reinforcement Learning: Relaxation Schemes
We study the minmax optimization problem introduced in [22] for computing
policies for batch mode reinforcement learning in a deterministic setting.
First, we show that this problem is NP-hard. In the two-stage case, we provide
two relaxation schemes. The first relaxation scheme works by dropping some
constraints in order to obtain a problem that is solvable in polynomial time.
The second relaxation scheme, based on a Lagrangian relaxation where all
constraints are dualized, leads to a conic quadratic programming problem. We
also theoretically prove and empirically illustrate that both relaxation
schemes provide better results than those given in [22].
Benchmarking for Bayesian Reinforcement Learning
In the Bayesian Reinforcement Learning (BRL) setting, agents try to maximise
the collected rewards while interacting with their environment, using some
prior knowledge that is accessed beforehand. Many BRL algorithms have already
been proposed, but even though a few toy examples exist in the literature,
there are still no extensive or rigorous benchmarks to compare them. The paper
addresses this problem, and provides a new BRL comparison methodology along
with the corresponding open source library. In this methodology, a comparison
criterion that measures the performance of algorithms on large sets of Markov
Decision Processes (MDPs) drawn from some probability distributions is defined.
In order to enable the comparison of non-anytime algorithms, our methodology
also includes a detailed analysis of the computation time requirement of each
algorithm. Our library is released with all source code and documentation: it
includes three test problems, each of which has two different prior
distributions, and seven state-of-the-art RL algorithms. Finally, our library
is illustrated by comparing all the available algorithms and the results are
discussed.
Comment: 37 pages
On overfitting and asymptotic bias in batch reinforcement learning with partial observability
This paper provides an analysis of the tradeoff between asymptotic bias
(suboptimality with unlimited data) and overfitting (additional suboptimality
due to limited data) in the context of reinforcement learning with partial
observability. Our theoretical analysis formally shows that, while
potentially increasing the asymptotic bias, a smaller state representation
decreases the risk of overfitting. This analysis relies on expressing the
quality of a state representation by bounding L1 error terms of the associated
belief states. Theoretical results are empirically illustrated when the state
representation is a truncated history of observations, both on synthetic POMDPs
and on a large-scale POMDP in the context of smartgrids, with real-world data.
Finally, similarly to known results in the fully observable setting, we also
briefly discuss and empirically illustrate how using function approximators and
adapting the discount factor may enhance the tradeoff between asymptotic bias
and overfitting in the partially observable context.
Comment: Accepted at the Journal of Artificial Intelligence Research (JAIR) - 31 pages
Cybersecurity in Power Grids: Challenges and Opportunities
Increasing volatilities within power transmission and distribution force power grid operators to amplify their use of communication infrastructure to monitor and control their grid. The resulting increase in communication creates a larger attack surface for malicious actors. Indeed, cyber attacks on power grids have already succeeded in causing temporary, large-scale blackouts in the recent past. In this paper, we analyze the communication infrastructure of power grids to derive resulting fundamental challenges of power grids with respect to cybersecurity. Based on these challenges, we identify a broad set of resulting attack vectors and attack scenarios that threaten the security of power grids. To address these challenges, we propose to rely on a defense-in-depth strategy, which encompasses measures for (i) device and application security, (ii) network security, and (iii) physical security, as well as (iv) policies, procedures, and awareness. For each of these categories, we distill and discuss a comprehensive set of state-of-the-art approaches, as well as identify further opportunities to strengthen cybersecurity in interconnected power grids.
Thin crystalline macroporous silicon solar cells with ion implanted emitter
We separate a (34 ± 2) μm-thick macroporous Si layer from an n-type Si wafer by means of electrochemical etching. The porosity is p = (26.2 ± 2.4)%. We use ion implantation to selectively dope the outer surfaces of the macroporous Si layer. No masking of the surface is required. The pores are open during the implantation process. We fabricate a macroporous Si solar cell with an implanted boron emitter at the front side and an implanted phosphorus region at the rear side. The short-circuit current density is 34.8 mA cm⁻² and the open-circuit voltage is 562 mV. With a fill factor of 69.1% the cell achieves an energy-conversion efficiency of 13.5%.
Funding: Federal Ministry for Environment, Nature Conservation, and Nuclear Safety/FKZ 032514
Multiple Slips in Atomic-Scale Friction: An Indicator for the Lateral Contact Damping
The occurrence of multiple jumps in 2D atomic-scale friction measurements is used to quantify the viscous damping accompanying the stick-slip motion of a sharp tip in contact with a NaCl(001) surface. Multiple slips are observed without apparent wear for normal forces between 13 and 91 nN. For scans parallel to [100] directions, the tip jumps between minima of the substrate corrugation potential in a zigzag fashion. An algorithm is applied to determine histograms of lateral force jumps which characterize multiple slips. The same algorithm is used to classify multiple slips occurring in calculated lateral force maps. Comparisons between simulations and experiments indicate that the nanometer-sized contact is underdamped at intermediate loads (13-26 nN) and becomes slightly overdamped at higher loads. The proposed procedure is a novel way to estimate the lateral contact damping which plays an important role in the interpretation of measurements of the velocity and temperature dependence of friction, of slip duration, and of the reduction of friction by applied perpendicular or parallel oscillation.
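9. Two cases of Gram-negative bacillary sepsis in diabetic patients (585th Regular Meeting of the Chiba Medical Society / Regular Meeting of the First Department of Internal Medicine Alumni Association)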
Offline computation cost vs. performance (inaccurate case).
A bibliometric analysis of orthogeriatric care: top 50 articles.
BACKGROUND
The population is ageing and orthogeriatric care is an emerging research topic.
PURPOSE
This bibliometric review aims to provide an overview of the most influential literature on orthogeriatric care and to investigate the status and trends of research in the field.
METHODS
From the Core Collection databases in the Thomson Reuters Web of Knowledge, the most influential original articles with reference to orthogeriatric care were identified in December 2020 using a multistep approach. A total of 50 articles were included and analysed in this bibliometric review.
RESULTS
The 50 most cited articles were published between 1983 and 2017 in 34 different journals. The number of total citations per article ranged from 34 to 704 (mean citations per article: n = 93). In the majority of publications, geriatricians (62%) accounted for the first authorship, followed by others (20%) and (orthopaedic) surgeons (18%). Articles mostly originated from Europe (76%), followed by Asia-Pacific (16%) and Northern America (8%). The UK, Sweden, and Spain are the key countries, and hip fracture is the key topic driving orthogeriatric research. The majority of articles reported on therapeutic studies (62%).
CONCLUSION
This bibliometric review acknowledges recent research. Orthogeriatric care is an emerging research topic to which surgeons have the potential to contribute; related topics such as intraoperative procedures, fractures other than hip fractures, or elective surgery have the potential to widen the field of research.